
Survey Crawls

Survey crawls are run about twice a year, on average, and attempt to capture the content of the front page of every web host ever seen by the Internet Archive since 1996.



Survey Crawl Number 8
collection
8,758
ITEMS
891.8M
VIEWS
collection
eye 891.8M
collection
eye 1.7B
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
collection
eye 1B
The seeds for this crawl came from: 251 million domains that had at least one link from a different domain in the Wayback Machine, across all time; ~300 million domains that we had in the Wayback, across all time; and 55,945,067 domains from https://archive.org/details/wide00016. This crawl was run with a Heritrix setting of "maxHops=0" (URLs including their embeds). The WARC files associated with this crawl are not currently available to the general public.
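The crawl descriptions in this listing distinguish "maxHops=0" (seed URLs plus their embeds) from level 1 (also the outbound links and their embeds). The following is a minimal sketch of that distinction as a toy model, not Heritrix's actual implementation: the function `crawl_scope`, the `links` and `embeds` dictionaries, and the example URLs are all hypothetical, and embeds of embeds are not modeled.

```python
from collections import deque


def crawl_scope(seeds, links, embeds, max_hops):
    """Return the set of URLs a hop-limited crawl would try to capture.

    `links` and `embeds` are hypothetical dicts mapping a page URL to the
    URLs it links to (navigation) or embeds (images, scripts, stylesheets).
    Following a link costs one hop; embeds are captured at no hop cost.
    """
    in_scope = set()   # every URL that would be captured
    visited = set()    # pages already expanded
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        url, hops = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        in_scope.add(url)
        # Embedded resources are captured together with their page.
        in_scope.update(embeds.get(url, []))
        # Outbound links are followed only while the hop budget allows.
        if hops < max_hops:
            for target in links.get(url, []):
                queue.append((target, hops + 1))
    return in_scope


# Hypothetical three-page web: a links to b, b links to c.
links = {
    "http://a.example/": ["http://b.example/"],
    "http://b.example/": ["http://c.example/"],
}
embeds = {
    "http://a.example/": ["http://a.example/logo.png"],
    "http://b.example/": ["http://b.example/style.css"],
}

front_page_only = crawl_scope(["http://a.example/"], links, embeds, max_hops=0)
level_one = crawl_scope(["http://a.example/"], links, embeds, max_hops=1)
```

In this model, `front_page_only` contains the seed and its logo, while `level_one` also contains the linked page and its stylesheet but not the page two links away.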
collection
eye 1.2B
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
collection
eye 926.3M
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
Survey Crawl Number 7
collection
6,605
ITEMS
720.5M
VIEWS
collection
eye 720.5M
This "Survey" crawl was started on Feb. 24, 2018. This crawl was run with a Heritrix setting of "maxHops=0" (URLs including their embeds). Survey 7 is based on a seed list of 339,249,218 URLs, which is all the URLs in the Wayback Machine that we saw a 200 response code from in 2017, based on a query we ran on Feb. 1st, 2018. The WARC files associated with this crawl are not currently available to the general public.
collection
eye 672.4M
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
collection
eye 579.9M
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
Survey Crawl Number 9
collection
561
ITEMS
147.7M
VIEWS
collection
eye 147.7M
.com survey started January 2011
collection
2,535
ITEMS
440.2M
VIEWS
collection
eye 440.2M
Survey crawl of .com domains started January 2011.
Topic: webcrawl
collection
eye 313.6M
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
survey_net00000
collection
300
ITEMS
53.3M
VIEWS
collection
eye 53.3M
Survey crawl of .net domains started December 2010.
Topic: webcrawl
survey_net00001
collection
170
ITEMS
19.3M
VIEWS
collection
eye 19.3M
Survey crawl of .net domains started October 2011.
Topics: webwidecrawl, net
COM Survey Crawl 2009-2010
collection
729
ITEMS
68.3M
VIEWS
collection
eye 68.3M
COM survey crawl data collected by Internet Archive in 2009-2010. This data is currently not publicly accessible.
survey_00010
web
eye 8.1M
favorite 0
comment 0
"Internet Archive crawldata from feed-driven by 1.2 million top ranked domains from data.domainrank.io - captured by crawl423.us.archive.org:survey_00010 from Mon May 11 14:14:43 PDT 2020 to Mon May 11 09:09:55 PDT 2020."
Topics: survey_00010, crawldata
ORG Survey Crawls
collection
191
ITEMS
29.8M
VIEWS
collection
eye 29.8M
Survey of .org domains. This data is currently not publicly accessible.
Survey Crawl Number 8
web
eye 643,725
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl344.us.archive.org:survey from Thu Jan 10 16:19:12 PST 2019 to Thu Jan 10 14:12:18 PST 2019.
Topic: crawldata
Survey Crawl Number 8
web
eye 647,093
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl339.us.archive.org:survey from Tue Jan 22 09:37:30 PST 2019 to Tue Jan 22 03:52:56 PST 2019.
Topic: crawldata
Survey Crawl Number 8
web
eye 649,187
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl818.us.archive.org:survey from Sat Jan 12 11:20:21 PST 2019 to Sat Jan 12 08:14:51 PST 2019.
Topic: crawldata
Survey Crawl Number 8
web
eye 601,320
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl835.us.archive.org:survey from Wed Jan 16 13:02:07 PST 2019 to Wed Jan 16 13:30:45 PST 2019.
Topic: crawldata
Survey Crawl Number 8
web
eye 1M
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl818.us.archive.org:survey from Tue Jan 8 08:38:27 PST 2019 to Tue Jan 8 07:59:00 PST 2019.
Topic: crawldata
Survey Crawl Number 6: Sep 11th, 2017 - running now
web
eye 307,742
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl344.us.archive.org:survey from Sat Sep 30 10:02:40 PDT 2017 to Sat Sep 30 03:34:44 PDT 2017.
Topic: crawldata
Survey Crawl Number 8
web
eye 753,340
favorite 1
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl838.us.archive.org:survey from Fri Jan 18 06:42:06 PST 2019 to Fri Jan 18 00:46:03 PST 2019.
Topic: crawldata
Survey Crawl Number 8
web
eye 751,162
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl835.us.archive.org:survey from Tue Jan 29 00:07:09 PST 2019 to Mon Jan 28 17:38:10 PST 2019.
Topic: crawldata
Survey Crawl Number 7
web
eye 1.5M
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl842.us.archive.org:survey from Tue Apr 10 00:41:08 PDT 2018 to Mon Apr 9 22:52:42 PDT 2018.
Topic: crawldata
Survey Crawl Number 8
web
eye 647,549
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl825.us.archive.org:survey from Tue Jan 15 00:32:09 PST 2019 to Tue Jan 15 02:46:53 PST 2019.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl453.us.archive.org:survey from Mon May 26 22:33:15 PDT 2014 to Mon May 26 23:08:51 PDT 2014.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl419.us.archive.org:survey from Tue May 27 01:34:25 PDT 2014 to Mon May 26 22:52:57 PDT 2014.
Topic: crawldata
Survey Crawl Number 7
web
eye 109,986
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl-hq10.us.archive.org:survey from Sat Feb 24 13:48:03 PST 2018 to Sat Feb 24 05:55:38 PST 2018.
Topic: crawldata
Survey Crawl Number 8
web
eye 815,851
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl818.us.archive.org:survey from Tue Jan 29 11:32:54 PST 2019 to Tue Jan 29 05:46:54 PST 2019.
Topic: crawldata
Survey Crawl Number 8
web
eye 894,957
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl836.us.archive.org:survey from Fri Feb 22 02:17:40 PST 2019 to Thu Feb 21 23:46:50 PST 2019.
Topic: crawldata
Survey Crawl Number 7
web
eye 3M
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl841.us.archive.org:survey from Mon May 14 17:51:22 PDT 2018 to Mon May 14 16:01:52 PDT 2018.
Topic: crawldata
survey_00010
web
eye 49,792
favorite 0
comment 0
"Internet Archive crawldata from feed-driven by 1.2 million top ranked domains from data.domainrank.io - captured by crawl421.us.archive.org:survey_00010 from Sun May 24 19:26:13 PDT 2020 to Sun May 24 15:10:12 PDT 2020."
Topics: survey_00010, crawldata
Survey Crawl Number 8
web
eye 581,795
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl344.us.archive.org:survey from Tue Jan 22 12:26:21 PST 2019 to Tue Jan 22 07:21:07 PST 2019.
Topic: crawldata
Survey Crawl Number 8
web
eye 567,317
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl825.us.archive.org:survey from Mon Jan 28 23:46:07 PST 2019 to Mon Jan 28 17:26:58 PST 2019.
Topic: crawldata
Survey Crawl Number 8
web
eye 527,226
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl339.us.archive.org:survey from Sat Jan 12 18:55:42 PST 2019 to Sat Jan 12 19:13:13 PST 2019.
Topic: crawldata
Survey Crawl Number 7
web
eye 3M
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl840.us.archive.org:survey from Tue May 15 20:39:06 PDT 2018 to Tue May 15 21:00:30 PDT 2018.
Topic: crawldata
Survey Crawl Number 8
web
eye 578,867
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl339.us.archive.org:survey from Wed Nov 21 15:02:49 PST 2018 to Wed Nov 21 10:16:09 PST 2018.
Topic: crawldata
Survey Crawl Number 5: Oct 21st, 2016 to Sep 10th, 2017
web
eye 150,115
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl817.us.archive.org:survey from Fri Jul 14 06:02:55 PDT 2017 to Fri Jul 14 02:22:22 PDT 2017.
Topic: crawldata
Survey Crawl Number 6: Sep 11th, 2017 - running now
web
eye 230,038
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl838.us.archive.org:survey from Fri Jan 5 09:25:00 PST 2018 to Fri Jan 5 01:50:03 PST 2018.
Topic: crawldata
Survey Crawl Number 8
web
eye 898,236
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl835.us.archive.org:survey from Fri Feb 8 21:12:51 PST 2019 to Fri Feb 8 14:44:22 PST 2019.
Topic: crawldata
Survey Crawl Number 8
web
eye 228,826
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl824.us.archive.org:survey from Sat Jan 26 15:02:34 PST 2019 to Sat Jan 26 08:47:21 PST 2019.
Topic: crawldata
survey_00010
web
eye 295,198
favorite 0
comment 0
Internet Archive crawl data from feed-driven by 1.2 million top ranked domains from "data.domainrank.io/", captured by crawl420.us.archive.org:survey_00010 from Wed May 13 15:35:36 PDT 2020 to Wed May 13 10:09:02 PDT 2020.
Topics: domainrank_top_mil, survey_00010, crawldata
Survey Crawl Number 8
web
eye 413,091
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl835.us.archive.org:survey from Thu Jan 31 07:15:17 PST 2019 to Thu Jan 31 01:24:44 PST 2019.
Topic: crawldata
Survey Crawl Number 8
web
eye 227,480
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl818.us.archive.org:survey from Thu Jan 24 00:44:19 PST 2019 to Wed Jan 23 18:36:19 PST 2019.
Topic: crawldata
Survey Crawl Number 7
web
eye 1.5M
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl-hq10.us.archive.org:survey from Sat Feb 24 04:16:28 PST 2018 to Fri Feb 23 20:23:38 PST 2018.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl420.us.archive.org:survey from Wed Jan 21 06:38:25 PST 2015 to Tue Jan 20 23:13:31 PST 2015.
Topic: crawldata
Survey Crawl Number 7
web
eye 201,131
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl843.us.archive.org:survey from Fri May 11 14:54:48 PDT 2018 to Fri May 11 09:38:10 PDT 2018.
Topic: crawldata
Survey Crawl Number 8
web
eye 190,052
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl344.us.archive.org:survey from Thu Nov 29 20:01:41 PST 2018 to Thu Nov 29 17:30:37 PST 2018.
Topic: crawldata
Survey Crawl Number 6: Sep 11th, 2017 - running now
web
eye 169,371
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl339.us.archive.org:survey from Sat Oct 14 11:17:53 PDT 2017 to Sat Oct 14 04:45:39 PDT 2017.
Topic: crawldata
Survey Crawl Number 8
web
eye 180,303
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl825.us.archive.org:survey from Mon Nov 26 06:50:00 PST 2018 to Mon Nov 26 06:16:37 PST 2018.
Topic: crawldata
survey_00010
web
eye 174,338
favorite 0
comment 0
Internet Archive crawldata from feed-driven by 1.2 million top ranked domains from "data.domainrank.io/", captured by crawl423.us.archive.org:survey_00010 from Tue Sep 15 16:20:44 PDT 2020 to Tue Sep 15 20:46:31 PDT 2020.
Topics: survey_00010, crawldata
Survey Crawl Number 7
web
eye 344,235
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl849.us.archive.org:survey from Tue Jul 17 16:57:11 PDT 2018 to Tue Jul 17 11:36:35 PDT 2018.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl427.us.archive.org:survey from Sat Feb 7 22:35:36 PST 2015 to Sat Feb 7 22:31:26 PST 2015.
Topic: crawldata
Survey Crawl Number 8
web
eye 244,547
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl824.us.archive.org:survey from Sat Jan 26 10:01:13 PST 2019 to Sat Jan 26 05:36:23 PST 2019.
Topic: crawldata
Survey Crawl Number 7
web
eye 362,483
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl843.us.archive.org:survey from Tue Jul 17 18:09:36 PDT 2018 to Tue Jul 17 13:08:03 PDT 2018.
Topic: crawldata
Survey Crawl Number 7
web
eye 438,045
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl841.us.archive.org:survey from Tue Jul 17 16:28:45 PDT 2018 to Tue Jul 17 11:24:40 PDT 2018.
Topic: crawldata
Survey Crawl Number 6: Sep 11th, 2017 - running now
web
eye 163,517
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl344.us.archive.org:survey from Mon Oct 16 01:15:40 PDT 2017 to Sun Oct 15 18:45:40 PDT 2017.
Topic: crawldata
Survey Crawl Number 5: Oct 21st, 2016 to Sep 10th, 2017
web
eye 168,390
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl836.us.archive.org:survey from Wed Oct 26 19:56:48 PDT 2016 to Wed Oct 26 15:40:51 PDT 2016.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl836.us.archive.org:survey from Sat Aug 1 00:43:43 PDT 2015 to Sat Aug 1 11:44:39 PDT 2015.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl805.us.archive.org:survey from Mon Jan 11 21:07:34 PST 2016 to Wed Jan 13 02:53:27 PST 2016.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl805.us.archive.org:survey from Fri Feb 19 11:46:14 PST 2016 to Fri Feb 19 12:08:16 PST 2016.
Topic: crawldata
Survey Crawl Number 6: Sep 11th, 2017 - running now
web
eye 167,946
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl824.us.archive.org:survey from Sat Oct 14 11:26:47 PDT 2017 to Sat Oct 14 04:54:28 PDT 2017.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl835.us.archive.org:survey from Fri Jun 9 20:38:43 PDT 2017 to Sat Jun 10 02:19:04 PDT 2017.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl835.us.archive.org:survey from Wed May 31 21:03:46 PDT 2017 to Sun Jun 4 02:33:07 PDT 2017.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl421.us.archive.org:survey from Sat Jan 9 03:10:06 PST 2016 to Sat Jan 9 11:11:11 PST 2016.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl423.us.archive.org:survey from Sat Jan 9 03:10:11 PST 2016 to Sat Jan 9 10:55:21 PST 2016.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl808.us.archive.org:survey from Sat Dec 27 23:31:46 PST 2014 to Sat Dec 27 21:07:24 PST 2014.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl455.us.archive.org:survey from Sun Dec 28 03:48:06 PST 2014 to Sat Dec 27 23:09:20 PST 2014.
Topic: crawldata
Survey Crawl Number 5: Oct 21st, 2016 to Sep 10th, 2017
web
eye 943,046
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl817.us.archive.org:survey from Tue May 23 05:54:32 PDT 2017 to Sat May 27 09:25:16 PDT 2017.
Topic: crawldata
Survey Crawl Number 5: Oct 21st, 2016 to Sep 10th, 2017
web
eye 992,692
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl836.us.archive.org:survey from Tue Apr 18 22:38:48 PDT 2017 to Sat Apr 22 00:04:16 PDT 2017.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl413.us.archive.org:survey from Tue Jun 3 04:58:57 PDT 2014 to Tue Jun 3 01:09:27 PDT 2014.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl806.us.archive.org:survey from Sat Aug 1 00:46:36 PDT 2015 to Sat Aug 1 12:04:06 PDT 2015.
Topic: crawldata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl824.us.archive.org:survey from Wed Apr 26 12:21:44 PDT 2017 to Mon May 15 13:42:33 PDT 2017.
Topic: crawldata
Survey Crawl Number 5: Oct 21st, 2016 to Sep 10th, 2017
web
eye 939,681
favorite 0
comment 0
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl835.us.archive.org:survey from Tue May 16 08:16:19 PDT 2017 to Wed May 17 05:32:51 PDT 2017.
Topic: crawldata